Abstract: Geosocial networks are online social networks centered
on the locations of subscribers and businesses. Providing
input to targeted advertising, profiling social network users
becomes an important source of revenue. Its natural reliance
on personal information introduces a trade-off between user
privacy and incentives of participation for businesses and
geosocial network providers. In this paper we introduce location
centric profiles (LCPs), aggregates built over the profiles of users
present at a given location. We introduce ProfilR, a suite
of mechanisms that construct LCPs in a private and correct
manner. We introduce iSafe, a novel, context aware public safety
application built on ProfilR. Our Android and browser plugin
implementations show that ProfilR is efficient: the end-to-end
overhead is small even under strong correctness assurances.

I. Introduction 

Online social networks have become a significant source of 
personal information. Facebook alone is used by more than 
1 out of 8 people today. Social network users voluntarily 
reveal a wealth of personal data, including age, gender, contact 
information, preferences and status updates. A recent addition 
to this space, geosocial networks (GSNs) such as Yelp [1],
Foursquare [2] or Facebook Places [3], further provide access
even to personal locations, through check-ins performed by 
users at visited venues. 

From the user perspective, personal information allows 
GSN providers to offer targeted advertising and venue own- 
ers to promote their business through spatio-temporal incen- 
tives (e.g., rewarding frequent customers through accumulated 
badges). The profitability of social network providers and 
participating businesses rests on their ability to collect, build 
and capitalize upon customer and venue profiles. Profiles 
are built based on user information - the more detailed the 
better. Providing personal information, however, exposes users
to significant risks, as social networks have been shown to
leak [4] and even sell [5] user data to third parties. Conversely,
from the provider and business perspective, being denied 
access to user information discourages participation. There 
exists therefore a conflict between the needs of users and those 
of providers and participating businesses: without privacy,
people may be reluctant to use geosocial networks; without
feedback, the provider and businesses have no incentive to
participate.

In this paper we take first steps toward breaking this deadlock,
by introducing the concept of location centric profiles
(LCPs). LCPs are aggregate statistics built from the profiles
of (i) users that have visited a certain location or (ii) a set of
co-located users.

We introduce ProfilR, a framework that allows the
construction of LCPs based on the profiles of present users,
while ensuring the privacy and correctness of participants.
Informally, we define privacy as the inability of venues and the
GSN provider to accurately learn user information, including
even anonymized location trace profiles. Thus, location privacy
is an inherent ProfilR requirement.

Correctness is a by-product of privacy: under the cover
of privacy users may try to bias LCPs. We consider two
correctness components: (i) location correctness - users can
only contribute to LCPs of venues where they are located and
(ii) LCP correctness - users can modify LCPs only in a predefined
manner. Location correctness is an issue of particular
concern. The use of financial incentives by venues to reward
frequent geosocial network customers has generated a surge of
fake check-ins [6]. Even with GPS verification mechanisms in
place, committing location fraud has been largely simplified by
the recent emergence of specialized applications for the most
popular mobile eco-systems (LocationSpoofer [7] for iPhone
and GPSCheat [8] for Android).

We propose first a venue centric ProfilR. To relieve the
GSN provider from a costly involvement in venue specific
activities, ProfilR stores and builds LCPs at venues. Participating
venue owners need to deploy an inexpensive device
inside their business, allowing them to perform LCP related
activities and verify the physical presence of participating
users. We extend ProfilR with the notion of snapshot LCPs,
built by user devices from the profiles of co-located users,
communicated over ad hoc wireless connections. Snapshot
LCPs are not bound to venues, but instead user devices
can compute LCPs of neighbors at any location of interest.
ProfilR relies on Benaloh's homomorphic cryptosystem
and zero knowledge proofs to enable oblivious and provably
correct LCP computations.

We further introduce iSafe, a context aware safety application
that uses ProfilR to privately build safety LCPs.
The constant population density increase, and the recent surge
of natural and man-made disasters, riots and lootings, make
safety aware applications of paramount importance. The goal
of iSafe is to make users aware of the safety of their surroundings
while preserving the privacy of participants. Safety
information can empower a suite of applications, including
safe walking/evacuation directions and safety dependent mobile
authentication.

Fig. 1. Yelp venue stats: (a) Distribution of the number of Yelp reviews per
venue, (b) Distribution of the distance from the venue "Ike's Place" to the
home cities of its reviewers.

We implemented iSafe and ProfilR as mobile application
and browser plugin components. Our experiments show that
on a smartphone, with a client cheating probability of 1 in a
million, the end-to-end overhead of an LCP update operation is
2.5s. We further rely on data collected from Yelp [1], a popular
geosocial network, to build user and venue safety labels. The
iSafe browser plugin introduces an overhead of under 1s for
collecting and processing 500 Yelp reviews.

The paper is organized as follows. Section II describes
the system and adversary model and defines the problem.
Section III introduces ProfilR and proves its privacy and
correctness. Section IV introduces snapshot LCPs and presents
the distributed, real-time variant of ProfilR. Section V introduces
iSafe and its implementation. Section VI evaluates the
performance of the proposed constructs. Section VII describes
related work and Section VIII concludes.

II. Background and Model 

We model the geosocial network (GSN) after Yelp [1].
It consists of a provider, S, hosting the system along with 
information about registered venues, and serving a number of 
subscribers. To use the provider's services, a client application 
needs to be downloaded and installed. Users register and 
receive initial service credentials, including a unique user id. 
We use the terms subscriber and user interchangeably to refer 
to users of the service and the term client to denote the 
software provided by the service and installed by users on 
their devices. 

The provider supports a set of businesses or venues, with an 
associated geographic location (e.g., restaurants, yoga classes, 
towing companies, etc). Users are encouraged to write reviews 
for visited locations, as well as report their location, through 
check-ins at venues where they are present. 

Participating venue owners need to install inexpensive
equipment (e.g., a $25 Raspberry Pi [9], a BeagleBoard or
any Android smartphone). Such equipment can also be used
for other tasks including detecting fake user check-ins [10],
preventing fake badges and incorrect rewards, and validating
social network (e.g., Yelp [1]) reviews, thus eliminating fake
negative reviews. The advantages provided by such solutions
can motivate the small investment.

We have collected data from 16,199 venues throughout the 
U.S. Besides the name, location and type of venue, we have
also collected all the reviews provided for these venues, for a 
total of 1,096,044 reviews. For each review we extracted the 
reviewer id, the date the review was written and the number 
of check-ins performed. Moreover, we have collected data 
from 10,031 Yelp users, including their id, location, number 
of friends and reviews, for a total of 646,017 reviews. Figure
1(a) shows the long-tail distribution of the number of reviews
per venue, for the collected venues. 

A. Location Centric Profiles 

Each user has a profile $P_U = \{u_1, u_2, .., u_d\}$, consisting
of values on d dimensions (e.g., age, gender, home city, etc).
Each dimension has a range, or a set of possible values. Given
a set of users U at location L, the location centric profile
at L, denoted by LCP(L), is the set $\{S_1, S_2, .., S_d\}$, where
$S_i$ denotes the aggregate statistics over the i-th dimension of
profiles of users from U.

In the following, we focus on a single profile dimension,
D. We assume D takes values over a range R that can
be discretized into a finite set of sub-intervals (e.g., set of
continuous disjoint intervals or discrete values). Then, given
an integer b, chosen to be dimension specific, we divide R into
b intervals/sets, $R_1, .., R_b$. For instance, gender maps naturally
to discrete values (b = 2), while age can be divided into
disjoint sub-intervals, with a higher b value. We define the
aggregate statistics S for dimension D of LCP(L) to consist
of b counters $c_1, .., c_b$; $c_i$ records the number of users from
U whose profile value on dimension D falls within range $R_i$,
i = 1..b.
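
As an illustration of the counter definition above, the following
minimal sketch computes the (unencrypted) counters of one LCP
dimension. The function name, the sample ages and the cut points
are hypothetical and chosen only for the example; the paper does
not prescribe a particular discretization.

```python
from bisect import bisect_right

def lcp_counters(values, cuts):
    """Count how many profile values fall in each of b = len(cuts) + 1 sub-ranges.

    `cuts` holds the sorted inner boundaries of the range R, so sub-range
    R_1 is everything below cuts[0], R_2 is [cuts[0], cuts[1]), and so on.
    """
    counters = [0] * (len(cuts) + 1)
    for v in values:
        counters[bisect_right(cuts, v)] += 1
    return counters

# Hypothetical "age" dimension with b = 4 sub-ranges.
ages = [19, 23, 31, 31, 45, 67]
age_cuts = [25, 40, 60]
print(lcp_counters(ages, age_cuts))  # [2, 2, 1, 1]
```

In the private setting described next, these counters are never
held in the clear at the venue; the sketch only shows what the
published tally represents.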

Figure 1(b) illustrates an LCP dimension: the distribution of
the (great-circle) distance in miles between a venue ("Ike's Place"
in San Francisco, CA) and the home cities of its (4000+)
reviewers. Note that more than 3000 reviews were left by
locals, information that can be used by the venue to better
cater to its customers.

B. Private LCP Requirements 

We define a private LCP solution to be a set of functions,
PP(k) = {Setup, Spoter, Checkin, PubStats} (see
Figure 2). Setup is run by each venue where user statistics
are collected, to generate parameters for user check-ins. To
perform a check-in, a user first runs Spoter, to prove her
physical presence at the venue. Spoter returns error if the
verification fails, success otherwise. If Spoter is successful,
Checkin is run between the user and the venue, and allows
the collection of profile information from the user. Specifically,
if the user's profile value v on dimension D falls within
the range $R_i$, the counter $c_i$ is incremented by 1. Finally,
PubStats publishes collected LCPs.

Fig. 2. Solution architecture (k=2). The red arrows denote anonymous
communication channels, whereas black arrows indicate authenticated (and
secure) communication channels.

Let $C_V$ be the set of counters defined at a venue V. Let $\bar{C}_V$
denote the set of b sets of counters derived from $C_V$, such that
each set in $\bar{C}_V$ has exactly one counter incremented over the
set $C_V$. A private LCP solution needs to satisfy the following
properties:

Location Correctness: Let A denote an adversary that con- 
trols the GSN provider and any number of users. Let C be 
a challenger that controls a venue V. A running as a user 
U not present at V, has negligible probability to successfully 
complete Spoter at V. 

LCP Correctness: Let A denote an adversary that controls the
GSN provider and any number of users. Let C be a challenger
that controls a venue V. Let $C_V$ denote the set of counters
at V before A runs Checkin at V and let $C'_V$ be the set
of counters afterward. If $C'_V \notin \bar{C}_V$, the Checkin completes
successfully with only negligible probability.

k-Privacy: Let A denote an adversary that controls any
number of venues and let C denote a challenger controlling
k users. C runs Spoter followed by Checkin at a venue V
controlled by A on behalf of i < k users. Let $C_i$ denote the
resulting counter set. For each j = 1..b, A outputs $c'_j$, its
guess of the value of the j-th counter of $C_i$. The advantage of
A, $Adv(A) = |\Pr[C_i[j] = c'_j] - 1/(i+1)|$, defined for each
j = 1..b, is negligible.

Check-In Indistinguishability (CI-IND): Let a challenger C
control two users $U_0$ and $U_1$ and let an adversary A control
any number of venues. A generates randomly q bits, $b_1, .., b_q$,
and sends them to C. For each bit $b_i$, i = 1..q, C runs Spoter
followed by Checkin on behalf of user $U_{b_i}$. At the end of
this step, C generates a random bit b and runs Spoter followed
by Checkin on behalf of $U_b$ at a venue not used before. A
outputs a bit b', its guess of b. The advantage of A,
$Adv(A) = |\Pr[b' = b] - 1/2|$, is negligible.

C. Attacker Model 

We assume venue owners are malicious and will attempt to 
learn private information from subscribers. Clients installed by 
users can be malicious, attempting to bias LCPs constructed at 
target venues. We assume the GSN provider does not collude 
with venues, but will try to learn private user information. 



D. Cryptographic Tools 

Homomorphic Cryptosystems. We use the Benaloh cryptosystem
[11], an extension of the Goldwasser-Micali cryptosystem [12].
It consists of three functions (K, E, D), defined as follows:

• K(k) - key generation: k, an odd integer, is the size of the
input block. Select two large primes p and q such that $k \mid (p-1)$,
$\gcd(k, (p-1)/k) = 1$ and $\gcd(k, q-1) = 1$. Let n = pq.
Select $y \in Z_n^*$ such that $y^{(p-1)(q-1)/k} \bmod n \neq 1$. n and y
are the public key and p and q are the private key.

• E(u,m): Encrypt message $m \in Z_k$, using a randomly
chosen value $u \in Z_n^*$. Output $y^m u^k \bmod n$.

• D(z): Decrypt ciphertext z. Let $z = y^m u^k \bmod n$. If
$z^{(p-1)(q-1)/k} \bmod n = 1$, then return m = 0. Otherwise, for
i = 1..k, compute $s_i = (y^{-i} z)^{(p-1)(q-1)/k} \bmod n$. If
$s_i = 1$, return m = i.

Benaloh's cryptosystem is additively homomorphic:
$E(u_1,m_1)E(u_2,m_2) = E(u_1u_2, m_1+m_2)$. We further define
the re-encryption function $RE(v, E(u,m))$ to be $y^m u^k v^k =
E(uv, m)$. Note that the re-encryption function can be invoked
without knowledge of the message m. Furthermore, it is
possible to show that two ciphertexts are the encryption
of the same plaintext, without revealing the plaintext. That
is, given E(u,m) and E(v,m), reveal $w = u^{-1}v$. Then,
$E(v,m) = RE(w, E(u,m))$.
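
The three functions and the two properties above can be tried out
with the following toy sketch. It is an illustration under stated
assumptions, not the implementation evaluated in this paper: the
function names are ours, the primes are tiny and insecure, and the
block size k is assumed to be prime so that the stated condition on
y is enough for unambiguous decryption.

```python
import math
import secrets

def keygen(k, p, q):
    """Return ((n, y, k), (p, q)) for caller-supplied primes p and q."""
    assert (p - 1) % k == 0 and math.gcd(k, (p - 1) // k) == 1
    assert math.gcd(k, q - 1) == 1
    n = p * q
    exp = (p - 1) * (q - 1) // k
    # Smallest y in Z_n^* with y^((p-1)(q-1)/k) != 1 mod n.
    y = next(c for c in range(2, n)
             if math.gcd(c, n) == 1 and pow(c, exp, n) != 1)
    return (n, y, k), (p, q)

def rand_unit(n):
    """Random element of Z_n^*."""
    while True:
        u = secrets.randbelow(n - 2) + 2
        if math.gcd(u, n) == 1:
            return u

def encrypt(pub, m):
    n, y, k = pub
    return pow(y, m % k, n) * pow(rand_unit(n), k, n) % n   # y^m * u^k mod n

def reencrypt(pub, z):
    n, _, k = pub
    return z * pow(rand_unit(n), k, n) % n                  # z * v^k mod n

def decrypt(pub, priv, z):
    n, y, k = pub
    p, q = priv
    exp = (p - 1) * (q - 1) // k
    y_inv = pow(y, -1, n)
    for i in range(k):
        # (y^-i * z)^((p-1)(q-1)/k) == 1 exactly when the plaintext is i.
        if pow(z * pow(y_inv, i, n) % n, exp, n) == 1:
            return i
    raise ValueError("not a valid ciphertext")

pub, priv = keygen(k=5, p=11, q=7)
c1, c2 = encrypt(pub, 1), encrypt(pub, 3)
print(decrypt(pub, priv, c1 * c2 % pub[0]))       # 4: plaintexts add up
print(decrypt(pub, priv, reencrypt(pub, c2)))     # 3: plaintext unchanged
```

Because messages live in $Z_k$, sums wrap around modulo k; a counter
that must reach a given maximum therefore needs k larger than that
maximum.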

Anonymizers. We use an anonymizer [13], [14], [15], [16] that
(i) operates correctly - the output corresponds to a permutation 
of the input and (ii) provides privacy - an observer is unable to 
determine which input element corresponds to a given output 
element in any way better than guessing. In the following we 
denote the anonymizer by Mix. 

Secret Sharing. Our constructions use a (k,n) threshold
secret sharing (TSS) [17] solution. Given a value R, TSS
generates n shares such that at least k shares are needed to
reconstruct R. A (k,n)-TSS solution satisfies the property of
hiding: An adversary (provided with access to a TSS oracle)
controlling the choice of two values $R_0$ and $R_1$ and given fewer
than k shares of $R_b$, $b \in_R \{0, 1\}$, can guess the value of b
with probability only negligibly higher than 1/2.
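
A (k,n) threshold scheme of this kind can be sketched as follows.
The sketch assumes Shamir's polynomial-based construction over a
prime field; the text above only cites [17] and does not state
which scheme or field is used, so the function names and the field
modulus are illustrative choices.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; secrets must be smaller than this

def make_shares(secret, k, n, prime=PRIME):
    """Split `secret` into n shares; any k of them reconstruct it."""
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(k - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):        # Horner's rule
            acc = (acc * x + c) % prime
        return acc

    return [(x, evaluate(x)) for x in range(1, n + 1)]

def reconstruct(shares, prime=PRIME):
    """Lagrange interpolation at x = 0 over the given shares."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = num * (-xm) % prime
                den = den * (xj - xm) % prime
        secret = (secret + yj * num * pow(den, -1, prime)) % prime
    return secret

shares = make_shares(123456789, k=3, n=5)
print(reconstruct(shares[:3]))    # 123456789
print(reconstruct(shares[2:]))    # 123456789
```

With fewer than k shares, every candidate secret remains equally
consistent with the shares seen, which is the hiding property used
in the privacy argument.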

III. ProfilR

Let SPOTR_V denote the device installed at venue V. For
each user profile dimension D, SPOTR_V stores a set of
encrypted counters - one for each sub-range of R.

Solution overview: Initially, and following each cycle
of k check-ins executed at venue V, SPOTR_V initiates Setup,
to request the provider S to generate a new Benaloh key pair.
Thus, at each venue time is partitioned into cycles: a cycle
completes once k users have checked-in at the venue. The
communication during Setup takes place over an authenticated
and secure channel (see Figure 2).

When a user U checks-in at venue V, it first engages in the
Spoter protocol with SPOTR_V. As shown in Figure 2, this
step is performed over an anonymous channel, to preserve
the user's (location) privacy. Spoter allows the venue to
verify U's physical presence through a challenge/response
protocol between SPOTR_V and the user device. Furthermore, a
successful run of Spoter provides U with a share of the secret
key employed in the Benaloh cryptosystem of the current
cycle. For each venue and user profile dimension, S stores
a set Sh of shares of the secret key that have been revealed
so far.

Subsequently, U runs Checkin with SPOTR_V, to send its
share of the secret key and to receive the encrypted counter
sets. As shown in Figure 2, the communication takes place
over an anonymous channel to preserve U's privacy. During
Checkin, for each dimension D, U increments the counter
corresponding to her range, re-encrypts all counters and sends
the resulting set to SPOTR_V. U and SPOTR_V engage in a
zero knowledge protocol that allows SPOTR_V to verify U's
correct behavior: exactly one counter has been incremented.
SPOTR_V stores the latest, proved to be correct encrypted
counter set, and inserts the secret key share into the set Sh.

Once k users successfully complete the Checkin procedure,
marking the end of a cycle, SPOTR_V runs PubStats to
reconstruct the private key, decrypt all encrypted counters and
publish the tally. The communication during PubStats takes
place over an authenticated channel (see Figure 2).

A. The Solution 

Let $C_i$ denote the set of encrypted counters at V, following
the i-th user run of Checkin. $C_i = \{C_i[1], .., C_i[b]\}$, where
$C_i[j]$ denotes the encrypted counter corresponding to $R_j$, the
j-th sub-range of R. We write $C_i[j] = E(u_j, u'_j, c_j, j) =
[E(u_j, c_j), E(u'_j, j)]$, where $u_j$ and $u'_j$ are random obfuscating
factors and E(u, m) denotes the Benaloh encryption of message
m using random factor u. That is, an encrypted counter
is stored for each sub-range of domain R of dimension D.
The encrypted counter consists of two records, encoding the
number of users whose values on dimension D fall within a
particular sub-range of R.

Let $RE(v_j, v'_j, E(u_j, u'_j, c_j, j))$ denote the re-encryption of
the j-th record with two random values $v_j$ and $v'_j$:
$RE(v_j, v'_j, E(u_j, u'_j, c_j, j)) = [E(u_j v_j, c_j), E(u'_j v'_j, j)]$.
Let $C_i[j]{+}{+} = E(u_j, u'_j, c_j + 1, j)$ denote the encryption
of the incremented j-th counter. Note that incrementing the
counter can be done without decrypting $C_i[j]$ or knowing the
current counter's value: $C_i[j]{+}{+} = [E(u_j, c_j)\,y, E(u'_j, j)] =
[y^{c_j+1} u_j^k, E(u'_j, j)] = [E(u_j, c_j + 1), E(u'_j, j)]$.
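
The blind increment of a two-record encrypted counter can be
sketched as below. This is a toy illustration with hypothetical
helper names and deliberately tiny, insecure Benaloh parameters
(k = 5, n = 77, y = 2); it is meant only to show that multiplying
the first record by y raises the hidden counter by one while the
index record is untouched.

```python
import math
import secrets

K, P, Q, Y = 5, 11, 7, 2            # assumed toy parameters, not secure
N = P * Q
EXP = (P - 1) * (Q - 1) // K

def unit():
    """Random element of Z_N^*."""
    while True:
        u = secrets.randbelow(N - 2) + 2
        if math.gcd(u, N) == 1:
            return u

def enc(m):
    return pow(Y, m, N) * pow(unit(), K, N) % N

def dec(z):
    # Needs the factorization of N (through EXP); only the key holder can run it.
    return next(i for i in range(K)
                if pow(z * pow(Y, -i, N) % N, EXP, N) == 1)

def enc_counter(c, j):
    """Two-record encrypted counter [E(u, c), E(u', j)]."""
    return (enc(c), enc(j))

def increment(rec):
    """Multiply the counter record by y; no decryption involved."""
    return (rec[0] * Y % N, rec[1])

def reencrypt(rec):
    """Re-randomize both records with fresh factors v and v'."""
    return (rec[0] * pow(unit(), K, N) % N, rec[1] * pow(unit(), K, N) % N)

rec = enc_counter(c=2, j=3)
new = reencrypt(increment(rec))
print(dec(new[0]), dec(new[1]))     # 3 3: counter went from 2 to 3, index stays 3
```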

In the following we use the above definitions to introduce
ProfilR. ProfilR instantiates PP(k), where k is the privacy
parameter. The notation $P(A(params_A), B(params_B))$
denotes the fact that protocol P involves participants A and
B, each with its own parameters.

Setup(V(), S(k)): The provider S runs the key generation
function K(k) of the Benaloh cryptosystem (see Section II-D).
Let p and q be the private key and n and y the public key.
S sends the public key to SPOTR_V. SPOTR_V generates a
signature key pair and registers the public key with S. For
each user profile dimension D of range R, SPOTR_V performs
the following steps:

• Initialize counters $c_1, .., c_b$ to 0. b is the number of R's
sub-ranges.

• Generate $C_0 = \{E(x_1, x'_1, c_1, 1), .., E(x_b, x'_b, c_b, b)\}$,
where $x_i, x'_i$, i = 1..b are randomly chosen values. Store $C_0$
indexed on dimension D.

• Initialize the share set $S_{key} = \emptyset$.

Spoter(U(K), V(), S(k)): U sets up an anonymous connection
with SPOTR_V, e.g., by using fresh, random MAC
and IP address values. SPOTR_V initiates a challenge/response
protocol, by sending to U the currently sampled time T, an
expiration interval $\Delta T$ and a fresh random value R. The
user's device generates a hash of these values and sends the
result back to SPOTR_V. SPOTR_V ensures that the response
is received within a specific interval from the challenge (see
Section VI for values and discussion). If the verification
succeeds, SPOTR_V uses its private key to sign a timestamped
token and sends the result to U. U contacts S over Mix and
sends the token signed by SPOTR_V. S verifies V's signature
as well as the freshness (and single use) of the token. Let U
be the i-th user checking-in at V. If the verifications pass and
i < k, S uses the (k, n) TSS to compute a share of p (Benaloh
secret key, factor of the modulus n). Let $p_i$ be the share of p.
S sends the (signed) share $p_i$ to U. If i > k, S calls Setup
to generate new parameters for V.
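
The timed challenge/response at the start of Spoter can be sketched
as follows. The sketch is only indicative: the text does not name
the hash function or the message encoding, so SHA-256, the field
separator and the millisecond integer timestamps are assumptions,
and the signed token and the exchange with S are left out.

```python
import hashlib
import secrets

def make_challenge(now_ms, delta_ms):
    """Venue side: sampled time T, expiration interval and fresh random R."""
    return {"T": now_ms, "delta": delta_ms, "R": secrets.token_bytes(16)}

def respond(challenge):
    """User side: hash of the three challenge values."""
    msg = f"{challenge['T']}|{challenge['delta']}|".encode() + challenge["R"]
    return hashlib.sha256(msg).digest()

def verify(challenge, response, received_ms):
    """Venue side: accept only a correct hash that arrives inside the window."""
    on_time = 0 <= received_ms - challenge["T"] <= challenge["delta"]
    return on_time and secrets.compare_digest(response, respond(challenge))

ch = make_challenge(now_ms=1_000_000, delta_ms=50)
answer = respond(ch)
print(verify(ch, answer, received_ms=1_000_020))   # True: answered within 50 ms
print(verify(ch, answer, received_ms=1_000_200))   # False: too late
```

The hash itself is easy to compute anywhere; it is the tight
response window over the venue's local link that is meant to rule
out a remote responder.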

CheckIn(U(K, n, V), V(n, y, $C_{i-1}$, $S_{key}$)): U uses
the same random MAC and IP addresses as in the previous
Spoter run. The protocol executes only if the previous run of
Spoter was successful. Let U be the i-th user checking-in at V. Then,
$C_{i-1}$ is the current set of encrypted counters. SPOTR_V sends
$C_{i-1}$ to U. Let v, U's value on dimension D, be within R's
j-th sub-range, i.e., $v \in R_j$. U runs the following steps:

• Generate b pairs of random values $\{(v_1, v'_1), .., (v_b, v'_b)\}$.
Compute the new encrypted counter set $C_i$, where the order
of the counters in $C_i$ is identical to $C_{i-1}$: $C_i =
\{RE(v_l, v'_l, C_{i-1}[l]) \mid l = 1..b, l \neq j\} \cup
RE(v_j, v'_j, C_{i-1}[j]{+}{+})$.

• Send $C_i$ along with the signed (by S) share $p_i$ of the private
key p to V.

If SPOTR_V successfully verifies the signature of S on the share
$p_i$, U and SPOTR_V engage in a zero knowledge protocol ZK-CTR
(see Section III-B). ZK-CTR allows U to prove that $C_i$
is a correct re-encryption of $C_{i-1}$: only one counter of $C_{i-1}$
has been incremented. If the proof verifies, SPOTR_V replaces
$C_{i-1}$ with $C_i$ and adds the share $p_i$ to the set $S_{key}$.

PubStats(V($C_k$, Sh, V), S(p, q)): SPOTR_V performs the
following actions:

• If |Sh| < k, abort.

• If |Sh| = k, use the k shares to reconstruct p, the private
Benaloh key.

• Use p and q = n/p to decrypt each record in $C_k$, the final
set of counters at V. Publish results.

B. ZK-CTR: Proof of Correctness 

We now present the zero knowledge proof of the set $C_i$
being a correct re-encryption of the set $C_{i-1}$, i.e., a single
counter has been incremented. Let ZK-CTR(i) denote the
protocol run for sets $C_{i-1}$ and $C_i$. U and SPOTR_V run the
following steps s times:

• U generates random values $(t_1, t'_1), .., (t_b, t'_b)$ and a random
permutation $\pi$, then sends to SPOTR_V the proof set $P_{i-1} =
\pi\{RE(t_l, t'_l, C_{i-1}[l]), l = 1..b\}$.

• U generates random values $(w_1, w'_1), .., (w_b, w'_b)$, then
sends to SPOTR_V the proof set $P_i = \pi\{RE(w_l, w'_l, C_i[l]),
l = 1..b\}$.

• SPOTR_V generates a random bit a and sends it to U.

• If a = 0, U reveals random values $(t_1, t'_1), .., (t_b, t'_b)$ and
$(w_1, w'_1), .., (w_b, w'_b)$. SPOTR_V verifies that for each l = 1..b,
$RE(t_l, t'_l, C_{i-1}[l])$ occurs in $P_{i-1}$ exactly once, and that for
each l = 1..b, $RE(w_l, w'_l, C_i[l])$ occurs in $P_i$ exactly once.

• If a = 1, U reveals $o_l = v_l w_l t_l^{-1}$ and $o'_l = v'_l w'_l t'^{-1}_l$,
for all l = 1..b, along with j, the position in $P_{i-1}$ and $P_i$ of the
incremented counter. SPOTR_V verifies that for all l = 1..b, $l \neq j$,
$RE(o_l, o'_l, P_{i-1}[l]) = P_i[l]$ and $RE(o_j, o'_j, P_{i-1}[j]\,y) = P_i[j]$.

• If any verification fails, SPOTR_V aborts the protocol.
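
One round of the protocol above can be simulated with the sketch
below. It makes several simplifying assumptions that are ours and
not part of the protocol description: each counter is a single
Benaloh ciphertext rather than a two-record pair, prover and
verifier are folded into one function, the parameters are tiny and
insecure, and all names are hypothetical.

```python
import math
import random

K, P, Q, Y = 5, 11, 7, 2            # assumed toy Benaloh parameters
N = P * Q

def unit():
    """Random element of Z_N^*."""
    while True:
        u = random.randrange(2, N)
        if math.gcd(u, N) == 1:
            return u

def re(v, z):
    """Re-encrypt ciphertext z with factor v: z * v^K mod N."""
    return z * pow(v, K, N) % N

def update(c_prev, incremented):
    """Re-encrypt all counters, first multiplying those in `incremented` by Y."""
    v = [unit() for _ in c_prev]
    c_new = [re(v[l], c * (Y if l in incremented else 1) % N)
             for l, c in enumerate(c_prev)]
    return c_new, v

def zk_round(c_prev, c_new, v, j, a):
    """Play one round for challenge bit a; return the verifier's verdict."""
    b = len(c_prev)
    t = [unit() for _ in range(b)]
    w = [unit() for _ in range(b)]
    perm = random.sample(range(b), b)   # counter l lands at position perm[l]
    p_prev, p_new = [0] * b, [0] * b
    for l in range(b):
        p_prev[perm[l]] = re(t[l], c_prev[l])
        p_new[perm[l]] = re(w[l], c_new[l])
    if a == 0:
        # t and w are revealed; the verifier recomputes both proof sets.
        return (sorted(re(t[l], c_prev[l]) for l in range(b)) == sorted(p_prev)
                and sorted(re(w[l], c_new[l]) for l in range(b)) == sorted(p_new))
    # a == 1: reveal o_l = v_l * w_l / t_l (in permuted order) and the position of j.
    o = [0] * b
    for l in range(b):
        o[perm[l]] = v[l] * w[l] * pow(t[l], -1, N) % N
    pos = perm[j]
    return all(re(o[x], p_prev[x] * (Y if x == pos else 1) % N) == p_new[x]
               for x in range(b))

c0 = [pow(unit(), K, N) for _ in range(3)]      # three encryptions of 0
c1, v = update(c0, {1})                         # honest: one increment
print(all(zk_round(c0, c1, v, 1, a) for a in (0, 1)))   # True
c_bad, v_bad = update(c0, {0, 1})               # cheater: two increments
print(zk_round(c0, c_bad, v_bad, 0, 1))         # False
```

The cheating update still passes the a = 0 branch, which is why a
single round only catches it with probability 1/2 and the round is
repeated s times.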

C. Preventing Illegal Votes 

For simplicity of presentation, we have avoided the Sybil attack
problem: participants that cheat through multiple accounts
they control or by exploiting the anonymizer. For instance,
a rogue venue owner, controlling k-1 Sybil user accounts
or simulating k-1 check-ins, can use ProfilR to reveal the
profile of a real user. Conversely, a rogue user (including the
venue) could bias the statistics built by the venue (and even
deny service) by checking-in multiple times in a short interval.

Sybil detection techniques (see Section VII) can be used
to control the number of fake, Sybil accounts. However, the
use of the anonymizer prevents the provider, and the use of
the unique IP and MAC addresses prevents the venue, from
differentiating between interactions with the same or different
accounts. In this section we propose a solution that, when
used in conjunction with Sybil detection tools, mitigates this
problem. The solution introduces a trade-off between privacy
and security.

Specifically, we divide time into epochs (e.g., one day long).
A user can check-in at any venue at most once per epoch.
When active, once per epoch e, each user U contacts the
provider S over an authenticated channel. U and S run a
blind signature [18] protocol: U obtains the signature of S
on a random value, $R_{U,e}$. S does not sign more than one
value for U for any epoch. In runs of Spoter and Checkin
during epoch e, U uses $R_{U,e}$ as its pseudonym (i.e., MAC and
IP address). Venues can verify the validity of the pseudonym
using S's signature. A venue accepts a single Checkin per
epoch from any pseudonym, thus limiting the user's impact
on the LCP.
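
The pseudonym issuance step can be sketched with a textbook RSA
blind signature. This is an assumption made for illustration: the
text only cites [18] and does not say which blind signature scheme
is used. The key is a classic toy key (n = 3233), the hash is
reduced modulo n, and no padding is applied, so the sketch is not
secure as written.

```python
import hashlib
import math
import secrets

P, Q, E = 61, 53, 17                 # toy RSA key of the provider S
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, known only to S

def h(msg):
    """Hash a pseudonym into Z_N (toy reduction)."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % N

def blind(msg):
    """User: hide h(msg) behind a random factor r before sending it to S."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            return h(msg) * pow(r, E, N) % N, r

def sign_blinded(blinded):
    """Provider: sign without learning the pseudonym."""
    return pow(blinded, D, N)

def unblind(blinded_sig, r):
    """User: strip r to obtain an ordinary signature on h(msg)."""
    return blinded_sig * pow(r, -1, N) % N

def verify(msg, sig):
    """Venue: check S's signature on the pseudonym."""
    return pow(sig, E, N) == h(msg)

pseudonym = secrets.token_bytes(16)          # stands in for R_{U,e}
blinded, r = blind(pseudonym)
sig = unblind(sign_blinded(blinded), r)
print(verify(pseudonym, sig))                # True
```

The provider enforces the one-value-per-epoch rule on the
authenticated request, while the venue later sees only the
unblinded pair, which the provider cannot link back to that
request.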

D. Analysis 

Given a set of encrypted counters C, let $\bar{C}$ denote the set
of re-encryptions of records of C, where only one record has
its counter incremented. We introduce the following theorem.

Theorem 1: ZK-CTR(i) is a ZK proof of $C_i \in \bar{C}_{i-1}$.



Proof: We need to prove completeness, soundness and
zero-knowledge. For completeness, if $C_i \in \bar{C}_{i-1}$, in each of the
s steps, U succeeds to convince S, irrespective of the challenge
bit a. If a = 0, U can produce the random obfuscating values
proving that the proof sets $P_{i-1}$ and $P_i$ are correctly generated
from $C_{i-1}$ and $C_i$. If a = 1, U can build the obfuscating
factors proving that $P_i \in \bar{P}_{i-1}$.

For soundness, we need to prove that if $C_i \notin \bar{C}_{i-1}$, U cannot
convince S unless with negligible probability. For simplicity,
we assume $C_i \notin \bar{C}_{i-1}$ due to a single record in
$C_i$ being "bad": $C_{i-1}[j] = E(u_j, u'_j, c_j, j)$ and $C_i[j] =
E(v_j, v'_j, c'_j, j')$. In any round of the ZK-CTR protocol, U has
two options for cheating. First, U could count on the bit a
to come up 0. Then, U builds $P_{i-1}[j] = E(u_j t_j, u'_j t'_j, c_j, j)$
and $P_i[j] = E(v_j w_j, v'_j w'_j, c'_j, j')$. If however a = 1, U has
to come up with a value $\alpha_j$, such that $RE(\alpha_j, E(u_j, c_j)) =
E(v'_j, c'_j)$ or $RE(\alpha_j, E(u_j, c_j + 1)) = E(v'_j, c'_j)$. In the first
case, this means $y^{c_j}(u_j \alpha_j)^k = y^{c'_j} v'^{k}_j \bmod n$. Without
knowing n's factorization, U cannot compute k's inverse modulo
$\phi(n)$. Then, the equation is satisfied only if $c'_j = c_j + zk$, for
an integer z. Note however that Benaloh's cryptosystem only
works for values in $Z_k$, making this condition impossible to
satisfy. The second case is similar. The second cheating option
is to assume a will be 1 and build $P_i[j]$ to be a re-encryption
of $P_{i-1}[j]$. It is then straightforward to see that if a = 0, U
can only succeed in convincing S if $c'_j = c_j + zk$, which we
have shown is impossible for $z \neq 0$. Thus, in each round, U
can only cheat with probability 1/2. Following s rounds, this
probability becomes $1/2^s$.
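
As a worked instance of this bound, consider how many rounds are
needed to reach the cheating probability of 1 in a million quoted
for the experiments. The value s = 20 below follows from the bound
alone and is our illustration, not a parameter stated in this
section.

\[
2^{-s} \le 10^{-6}
\iff s \ge \log_2 10^{6} = 6 \log_2 10 \approx 19.93 ,
\]

so s = 20 rounds suffice, giving
$2^{-20} = 1/1048576 \approx 9.5 \times 10^{-7}$.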

We now show that ZK-CTR conveys no knowledge to any
verifier, even one that deviates arbitrarily from the protocol.
We prove this by following the approach from [19], [20].
Specifically, let S* be an arbitrary, fixed, expected polynomial
time ITM. We generate an expected polynomial time machine
M* that, without being given access to the client, produces
an output whose probability distribution is identical to the
probability distribution of the output of $\langle C, S^* \rangle$.

We now build M* that uses S* as a black box many times.
Whenever M* invokes S*, it places input $x = (L_0, L_1)$ on
its input tape $IT_S$ and a fixed sequence of random bits on
its random tape, $RT_S$. The input x consists of $L_0 = C_0$ and
$L_1 = C_1$. The content of the input communication tape for
S*, $CT_S$, will consist of tuples $(P_{2i}, P_{2i+1}, \pi_i)$, where $P_{2i}$
and $P_{2i+1}$ are sets and $\pi_i$ is a permutation. The output of M*
consists of two tapes: the random-record tape $RT_M$ and the
communication-record tape $CT_M$. $RT_M$ contains the prefix of
the random bit string r read by S*. The machine M* works
as follows (round i):

Step 1: M* chooses a random bit $a \in_R \{0, 1\}$. If a = 0, M*
picks a random permutation $\pi_i$, generates $t_l, t'_l$, l = 1..b
randomly and computes $P_{2i} = \pi_i\{RE(t_l, t'_l, C_{i-1}[l]), l = 1..b\}$.
It then generates random values $w_l, w'_l$, l = 1..b, and
computes the set $P_{2i+1} = \pi_i\{RE(w_l, w'_l, C_i[l]), l = 1..b\}$.
Note that M* does not need to know the counters to perform
this operation. If a = 1, M* generates a random set $P_{2i}$,
then generates random values $o_l, o'_l$, l = 1..b. It then
generates a random $j \in 1..b$ and computes $P_{2i+1}$ such that for
all l = 1..b, $l \neq j$, $RE(o_l, o'_l, P_{2i}[l]) = P_{2i+1}[l]$ and for the
j-th position, $RE(o_j, o'_j, P_{2i}[j]\,y) = P_{2i+1}[j]$.

Step 2: M* sets

$b = S^*(x, r; P_0, P_1, \pi_0, .., P_{2i-2}, P_{2i-1}, \pi_{i-1}, P_{2i}, P_{2i+1})$.

That is, b is the output of S* on input x and random string
r after receiving i - 1 tuples $(P_{2j}, P_{2j+1}, \pi_j)$, j = 1..i - 1, and
proof $P_{2i}, P_{2i+1}$ on its communication tape $CT_S$. We have the
following three cases.

(Case 1). a = b = 0. M* can produce $t_l, t'_l, w_l, w'_l$, l = 1..b
and $\pi_i$ to prove that $P_{2i} = \pi_i\{RE(t_l, t'_l, C_{i-1}[l]), l = 1..b\}$
and $P_{2i+1} = \pi_i\{RE(w_l, w'_l, C_i[l]), l = 1..b\}$. M* sets $b_i$ to
b, appends the tuple $(P_{2i}, P_{2i+1}, \pi_i, b_i)$ to $CT_M$ and proceeds
to the next round (i+1).

(Case 2). a = b = 1. M* can produce $o_l, o'_l$, l = 1..b, and
index j such that $RE(o_l, o'_l, P_{2i}[l]) = P_{2i+1}[l]$, l = 1..b, $l \neq j$
and $RE(o_j, o'_j, P_{2i}[j]\,y) = P_{2i+1}[j]$. M* sets $b_i$ to b, appends
the tuple $(P_{2i}, P_{2i+1}, \pi_i, b_i)$ to $CT_M$ and proceeds to the next
round (i+1).

(Case 3). $a \neq b$. M* discards all the values of the current
iteration and repeats the current round (Step 1 and 2).

If all rounds are completed, M* halts and outputs
$(x, r', CT_M)$, where r' is the prefix of the random bits r
scanned by S* on input x. We first prove that M* terminates in
expected polynomial time and then that the output distribution
of M* is the same as the output distribution of S* when
interacting with the client, on input $(L_0, L_1)$.

Lemma 1: M* terminates in expected polynomial time.

Proof: Given $C_0$ and $C_1$, during the i-th round $P_{2i}$ and
$P_{2i+1}$ are either built from $C_0$ and $C_1$ or from each other.
During each run of round i, the bit a is chosen independently. Then
$P_{2i}$ and $P_{2i+1}$ are also chosen independently. This implies that
the probability that a = b is 1/2 and the expected number of
repetitions of round i is 2. S* is expected polynomial time,
which implies that M* is also expected polynomial time. ■

Lemma 2: The probability distributions of $\langle C, S^* \rangle (L_0, L_1)$
and of $M^*(L_0, L_1)$ are identical.

Proof: The output of $\langle C, S^* \rangle (L_0, L_1)$ and of
$M^*(L_0, L_1)$ consists of a sequence of t tuples of format
$(P_{2i}, P_{2i+1}, \pi_i, b_i)$. Let $\Pi^{(x,r,i)}_{M^*}$ and $\Pi^{(x,r,i)}_{CS^*}$ be the
probability distributions of the first i tuples output by M* and
$\langle C, S^* \rangle$. We need to show that for any fixed random input r,
$\Pi^{(x,r,i)}_{M^*} = \Pi^{(x,r,i)}_{CS^*}$. We prove this by induction. The base
case, where i = 0, holds immediately. In the induction step
we assume that $\Pi^{(x,r,i)}_{M^*} = \Pi^{(x,r,i)}_{CS^*} = T^{(i)}$. We need to prove
that the (i+1)-st tuples in $\Pi^{(x,r,i+1)}_{M^*}$, denoted by $\Pi^{(i+1)}_{M^*}$, and
in $\Pi^{(x,r,i+1)}_{CS^*}$, denoted by $\Pi^{(i+1)}_{CS^*}$, have the same distribution.
We show that $\Pi^{(i+1)}_{M^*}$ and $\Pi^{(i+1)}_{CS^*}$ are uniform over the set

$V = \{(P_{2i}, P_{2i+1}, \pi_i, b) \mid b = S^*(x, r, T^{(i)} \| P) \wedge
((P_{2i} = \pi_i RE(C_0), P_{2i+1} = \pi_i RE(C_1), \text{ if } b = 0) \vee
(P_{2i+1}[l] = RE(P_{2i}[l]), l = 1..b, l \neq j,
P_{2i+1}[j] = y\,RE(P_{2i}[j]), \text{ if } b = 1))\}$.

For $\Pi^{(i+1)}_{M^*}$, this is the case, by construction. If $\Pi^{(i+1)}_{CS^*}$
has output, it is also uniformly distributed in V. ■

M* terminates in expected polynomial time and its output
has the same distribution as the output of the interaction
between S* and a client. Thus, the theorem follows. ■

We can now prove the following results. 

Theorem 2: ProfilR is k-private.

Proof: (Sketch) Following the definition from Section II-B,
let us assume that the adversary A has access to
an encrypted counter set $C_i$ generated after C has run Spoter
followed by Checkin on behalf of i < k different users. The
records of set $C_i$ are encrypted and A has i shares of the
private key. For any j = 1..b, let $c'_j$ be A's guess of the value
of the j-th counter in $C_i$. If $|\Pr[C_i[j] = c'_j] - 1/(k+1)| = \epsilon$ is
non-negligible we can use A to construct an adversary B that
has $\epsilon$ advantage in the (i) semantic security game of Benaloh
or in the (ii) hiding game of the (k,n) TSS. We start with
the first reduction. B generates two messages $M_0 = 0$ and
$M_1 = 1$ and sends them to the challenger C. C picks a bit
$d \in_R \{0,1\}$ and sends to B the value $E(u, M_d)$, where u
is random and E denotes Benaloh's encryption function. B
initiates a new game with A, with counters set to 0. B runs
Spoter and Checkin (acting as challenger) with A. B re-encrypts
all counters from A, except the j-th one, which it
replaces with $E(u, M_d)$. B runs ZK-CTR with A (used as a
black box) a polynomial number of times until it succeeds.
A outputs its guess of the values of all counters. B sends the
guess for the j-th counter to C. The advantage of B in this
game comes entirely from the advantage provided by A.

For the second reduction, $B$ runs Setup as the provider and
obtains the secret key $p_0$ and $p_1$ (renamed from $p$ and $q$).
$B$ sends $p_0$ and $p_1$ to the challenger $C$, as its choice of two
random values. $C$ generates a random bit $a$, uses the $(k,n)$ TSS
to generate $i < k$ shares of $p_a$, $sh_1, .., sh_i$, and sends them
to $B$. $B$ generates a new random prime $q$ and picks randomly a
bit $d$. Let the Benaloh modulus be $n = p_d q$. Then, acting as $i$
different users $U_j$, $j = 1..i$, $B$ runs Spoter with $S$ (which it
also controls) to obtain $S$'s signature on $sh_j$. For each of the
$i$ users, $B$ runs CheckIn with $A$. At the end of the process,
$A$ outputs its guess of the encrypted counters. If the guess is
correct on more than $d/(j + 1)$ counters, $B$ sends $d$ to $C$ as
its guess for $a$. Otherwise, it sends $\bar{d}$. Thus, $B$'s
advantage in the hiding game of TSS is equivalent to $A$'s advantage
against PROFIL_R. ■

Theorem 3: PROFIL_R ensures location correctness.

Proof: The user's location is verified in the Spoter
protocol. A single malicious user, not present at venue $V$,
is unable to establish a connection with the device deployed
at $V$, SPOTR_V. Thus, the user is unable to participate in the
challenge/response protocol and receive at its completion a
provider-signed share of the Benaloh secret key. Without the
share, the user is unable to initiate the CheckIn protocol. Two
(or more) attackers can launch wormhole attacks: one attacker,
present at $V$, acts as a proxy and relays information between
SPOTR_V and a remote attacker. This may allow the remote
attacker to successfully run Spoter and CheckIn at $V$. In
Section VI we present experimental evidence that Spoter detects
wormhole attacks. ■

Theorem 4: PROFIL_R is LCP correct.



Proof: (Sketch) A user $U$ can alter the LCP of a venue $V$
in two ways. First, during the ZK-CTR protocol, it modifies
more than one counter or corrupts (at least) one counter.
The soundness property of ZK-CTR, proved in Theorem 1,
shows this attack succeeds with probability $2^{-t}$. Second,
it attempts to prevent $V$ from decrypting the counter sets
after $k$ users have run CheckIn. This can be done by
preventing SPOTR_V from reconstructing the private Benaloh key.
Key shares are however signed by the provider, allowing
SPOTR_V to detect invalid shares. ■
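
To make the soundness bound concrete, the following minimal sketch relates the number of ZK-CTR rounds $t$ to the probability $2^{-t}$ that a cheating client goes undetected. The function names are hypothetical and chosen only for this illustration; they are not part of the implementation.

```python
import math


def cheat_probability(rounds: int) -> float:
    """Probability that a cheating client passes all `rounds` ZK-CTR rounds (2^-t)."""
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    return 2.0 ** -rounds


def rounds_needed(max_cheat_probability: float) -> int:
    """Smallest t such that 2^-t <= max_cheat_probability."""
    if not 0.0 < max_cheat_probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return math.ceil(-math.log2(max_cheat_probability))


if __name__ == "__main__":
    print(cheat_probability(30))
    print(rounds_needed(1e-9))
```

Under this bound, each additional round halves the cheating probability, so the number of rounds grows only logarithmically in the inverse of the tolerated error.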

Theorem 5: PROFIL_R provides CI-IND.

Proof: (Sketch) Let $A$ be an adversary that has an $\epsilon$
advantage in the CI-IND game. We assume the challenger does
not run Spoter and CheckIn twice for the same (user, epoch)
pair; otherwise the use of the signed pseudonyms provides
an advantage to $A$. Note that if pseudonyms are not used,
this requirement is not necessary. Moreover, no identifying
information is sent by users during Spoter and CheckIn: the
pseudonyms are blindly signed by $S$, and all communication
with $S$ takes place over Mix. ■

IV. Snapshot LCP 

We extend PROFIL_R to allow not only venues but also users
to collect snapshot LCPs of other, co-located users. To achieve
this, we take advantage of the ability of most modern mobile
devices (e.g., smartphones, tablets) to set up ad hoc networks.
Devices establish local connections with neighboring devices
and privately compute the instantaneous aggregate LCP of
their profiles.

A. Snapshot PROFIL_R

We assume a user $U$ co-located with $k$ other users $U_1, .., U_k$.
$U$ needs to generate the LCP of their profiles, without
infrastructure, GSN provider or venue support. An additional
difficulty then is that participating users need assurances
that their profiles will not be revealed to $U$. However,
one advantage of this setup is that location verification
is not needed: $U$ intrinsically determines co-location with
$U_1, .., U_k$. Snapshot PROFIL_R consists of three protocols,
{Setup, LCPGen, PubStats}:

Setup($U(r)$, $U_1()$, .., $U_k()$): $U$ performs the following
steps:

• Run the key generation function $K(r)$ of the Benaloh
cryptosystem (see Section III-D). Send the public key $n$ and $y$
to each user $U_1, .., U_k$.

• Engage in a multi-party secure function evaluation protocol
[21], [22] with $U_1, .., U_k$ to generate shares of a public
value $R < n$. At the end of the protocol, each user $U_i$ has a
share $R_i$, such that $R_1 .. R_k = R \bmod n$ and $R_i$ is only
known to $U_i$.

• Assign each of the $k$ users a unique label between 1 and
$k$. Let $U_1, .., U_k$ denote this order.

• Generate $C_0 = \{E(x_1, x'_1, 0, 1), .., E(x_b, x'_b, 0, b)\}$, where
$x_i, x'_i$, $i = 1..b$, are randomly chosen. Store $C_0$ indexed on
dimension $D$.




Fig. 3. Static crime indexes computed over crimes reported during 2010 in 
the Miami-Dade county. 



Each of the $k$ users engages in a 1-on-1 LCPGen with $U$ to
privately and correctly contribute her profile to $U$'s LCP.

LCPGen($U(C_{i-1})$, $U_i()$): Let $C_{i-1}$ be the encrypted
counters after $U_1, .., U_{i-1}$ have completed the protocol with
$U$. $U$ sends $C_{i-1}$ to $U_i$. $U_i$ runs the following:

• Generate random values $(v_1, v'_1), .., (v_b, v'_b)$. Let $j$ be the
index of the range where $U_i$ fits on dimension $D$.

• Compute the new encrypted counter set $C_i$ as
$C_i = \{RE(v_l, v'_l, C_{i-1}[l]) R_i \bmod n \mid l = 1..b, l \neq j\} \cup \{RE(v_j, v'_j, C_{i-1}[j]{+}{+}) R_i \bmod n\}$
and send it to $U$.

• Engage in a ZK-CTR protocol to prove that $C_i \in C_{i-1}$.
The only modification to the ZK-CTR protocol is that all
re-encrypted values are also multiplied with $R_i \bmod n$, $U_i$'s
share of the public value $R$. If the proof verifies, $U$ replaces
$C_{i-1}$ with $C_i$.

After completing LCPGen with $U_1, .., U_k$, $U$'s encrypted
counter set is $C_k = \{E_j = E(u_j, u'_j, c_j, j) R_1 .. R_k \mid j = 1..b\}$,
where $u_j$ and $u'_j$ are the products of the obfuscation factors used
by $U_1, .., U_k$ in their re-encryptions. The following protocol
enables $U$ to retrieve the snapshot LCP.

PubStats($U(C_k)$): Compute $E_j K$, $\forall j = 1..b$, where
$K = R^{-1} \bmod n$ ($R = R_1 .. R_k$), decrypt the outcome using
the private key $(p, q)$ and publish the resulting counter value.
Even though $U$ has the private key allowing it to decrypt any
Benaloh ciphertext, the use of the secret $R_i$ values prevents it
from learning the profile of $U_i$, $i = 1..k$.
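
The flow of Setup, LCPGen and PubStats can be illustrated with a toy sketch. It is not the implementation evaluated later, and it makes several simplifying assumptions that are not from the protocol description: it uses the textbook Benaloh encryption $y^m u^r \bmod n$ with a single random factor instead of $E(x, x', c, j)$, it picks insecure toy parameters ($r = 7$, $p = 29$, $q = 23$), it replaces the secure function evaluation with locally drawn shares, and it omits ZK-CTR entirely. All function names are hypothetical.

```python
import math
import random

# Toy Benaloh parameters (assumption: far too small for real use).
R_BLOCK = 7                 # plaintext space Z_7, so counters must stay below 7
P, Q = 29, 23               # 7 | (29-1), gcd(7, 28/7) = 1, gcd(7, 23-1) = 1
N = P * Q
PHI = (P - 1) * (Q - 1)


def rand_unit(rng):
    """Random invertible element of Z_N."""
    while True:
        u = rng.randrange(2, N)
        if math.gcd(u, N) == 1:
            return u


def pick_y():
    """Smallest unit y with y^(PHI/r) != 1 mod N (sufficient because r is prime)."""
    for y in range(2, N):
        if math.gcd(y, N) == 1 and pow(y, PHI // R_BLOCK, N) != 1:
            return y
    raise ValueError("no suitable y found")


Y = pick_y()


def encrypt(m, rng):
    """Textbook Benaloh encryption of m in Z_r."""
    return pow(Y, m, N) * pow(rand_unit(rng), R_BLOCK, N) % N


def reencrypt(c, rng, increment=False):
    """Re-randomize c; optionally add 1 to the plaintext homomorphically."""
    c = c * pow(rand_unit(rng), R_BLOCK, N) % N
    return c * Y % N if increment else c


def decrypt(c):
    """Brute-force discrete log in the order-r subgroup (fine for tiny r)."""
    a = pow(c, PHI // R_BLOCK, N)
    x = pow(Y, PHI // R_BLOCK, N)
    for m in range(R_BLOCK):
        if pow(x, m, N) == a:
            return m
    raise ValueError("not a valid ciphertext")


def snapshot_lcp(user_ranges, b, rng):
    """user_ranges[i] is the 0-based range index j of user U_{i+1}."""
    # Setup: shares R_i with public product R (stand-in for the SFE protocol).
    shares = [rand_unit(rng) for _ in user_ranges]
    R = math.prod(shares) % N
    counters = [encrypt(0, rng) for _ in range(b)]
    # LCPGen: each user re-encrypts all counters, increments her own range,
    # and blinds every counter with her share R_i.
    for j, r_i in zip(user_ranges, shares):
        counters = [reencrypt(c, rng, increment=(l == j)) * r_i % N
                    for l, c in enumerate(counters)]
    # PubStats: remove the blinding with K = R^-1 mod N, then decrypt.
    K = pow(R, -1, N)
    return [decrypt(c * K % N) for c in counters]


if __name__ == "__main__":
    print(snapshot_lcp([0, 2, 2, 4, 1], b=5, rng=random.Random(1)))
```

In this sketch each counter picks up every share $R_i$ exactly once, so a single multiplication by $K$ removes the blinding. Intermediate counter sets remain blinded by shares that $U$ does not know.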

V. iSafe: Context Aware Safety 

We introduce iSafe, an application built on PROFIL_R.
iSafe uses the context of users, in terms of their location,
time, and other people present, to build a safety representation.
Quantifying the safety of a user based on her current context
can be further used to provide safe walking directions and
context-aware smartphone authentication protocols (i.e., more
complex authentication protocols in unsafe locations). iSafe
combines information collected from Yelp with Census [23]
and historical crime databases as well as context collected by
the users' mobile devices. We have access to the Miami-Dade
county [24] area crime and Census datasets since 2007. Each
record in the crime dataset is labeled with a crime type (e.g.,
homicide, larceny, robbery) as well as the geographic location
and time of occurrence.

iSafe assigns static safety labels to Census-defined
geographic blocks. While beyond the scope here, we note that the
safety index is inversely proportional to the weighted average
of the crimes committed in the block. Figure 3 shows the
color-coded safety index for each block group in the Miami-Dade
county (FL) in 2010. iSafe uses the static block safety indexes
to compute safety labels of mobile users. The safety label of a
user is an average over the safety indexes of the blocks visited
by the user. Blocks visited more frequently have an inherently
higher impact on the user's safety label. Block and user safety
labels take values in the [0, 1] interval; 1 is the safest label.
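
As an illustration of the user safety label, the sketch below computes a visit-weighted average of block safety indexes. Weighting each block by its visit count is an assumption made here to capture the statement that frequently visited blocks have a higher impact. The function and argument names are hypothetical.

```python
def user_safety_label(visits, block_index):
    """Visit-weighted average of block safety indexes.

    visits:      block id -> number of visits by the user (assumed input format)
    block_index: block id -> static safety index in [0, 1]
    """
    total = sum(visits.values())
    if total == 0:
        raise ValueError("user has no recorded visits")
    return sum(block_index[block] * n for block, n in visits.items()) / total


if __name__ == "__main__":
    # Three visits to a fairly safe block, one visit to a less safe block.
    print(user_safety_label({"A": 3, "B": 1}, {"A": 0.8, "B": 0.4}))
```

Because the inputs lie in [0, 1] and the weights are non-negative, the resulting label also lies in [0, 1].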

iSafe uses PROFIL_R to privately compute the safety labels
for Yelp venues: the distribution of safety indexes of users that
reviewed them. To achieve this, iSafe divides the [0, 1] safety
range into a discrete set of disjoint sub-intervals, and assigns a
counter to each sub-interval. Each venue privately retrieves the
distribution of the safety values of its reviewers (the counters
of users fitting the corresponding sub-intervals). Finally, the
safety index of the venue is the weighted average of the
aggregated counts. The normalized weights are either the
upper bound value or the middle point of their corresponding
sub-intervals.
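
The venue safety index can be sketched as follows, assuming equal-width sub-intervals of [0, 1] (an assumption; the width of the sub-intervals is not fixed above). The function name and its `use_midpoint` switch are hypothetical.

```python
def venue_safety_index(counts, use_midpoint=True):
    """Weighted average of per-sub-interval reviewer counts.

    counts[i] is the decrypted counter of the i-th sub-interval of [0, 1].
    Each sub-interval is weighted by its middle point or by its upper bound.
    """
    b = len(counts)
    total = sum(counts)
    if b == 0 or total == 0:
        raise ValueError("no reviewers to aggregate")
    width = 1.0 / b
    offset = 0.5 if use_midpoint else 1.0
    weights = [(i + offset) * width for i in range(b)]
    return sum(c * w for c, w in zip(counts, weights)) / total


if __name__ == "__main__":
    counts = [0, 1, 1, 2, 1]          # five sub-intervals, five reviewers
    print(venue_safety_index(counts))
    print(venue_safety_index(counts, use_midpoint=False))
```

Using the upper bound instead of the middle point shifts every weight up by half a sub-interval, so the two variants differ by a constant $1/(2b)$.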

Besides this venue-centric approach, iSafe also uses
snapshot PROFIL_R to privately aggregate the safety labels of
co-located user devices and distributively obtain the real-time
image of the safety of their location.

A. Implementation 

We implemented iSafe as (i) a web server, (ii) a browser
plugin running in the user's browser and (iii) a mobile
application. We use Apache Tomcat 6.0.35 to route requests
(exposed to the client through a REST API) to our
server-side component. The server-side component relies on
the latest servlet v3.0, which offers additional features
including asynchronous support, making the server-side processing
much more efficient. We implemented the browser plugin for
the Chrome browser using HTML, CSS and JavaScript. The
plugin interacts with Yelp pages and the web server, using
content scripts (Chrome-specific components that let us access
the browser's native API) and cross-origin XMLHttpRequests.

The browser plugin becomes active when the user navigates
to a Yelp page. For user and venue pages, the plugin parses
their HTML file and retrieves their reviews. We employ a
stateful approach, where the server's DB stores all reviews
of pages previously accessed by users. This enables significant
time savings, as the plugin needs to send to the web server only
reviews written after the date of the last user's access to the
page. Given the venue's set of reviews, the server determines
the corresponding reviewers. Since we do not have access
to the location history of users, to compute a user's safety
label we rely on the venues reviewed by the user: the user
safety is computed as an average over the safety labels of
the blocks containing the venues reviewed by the user. Given
the safety labels of reviewers, we run PROFIL_R to determine
their distribution and identify the safety level of the venue.
The server sends back the safety level of the venue, which
the plugin displays in the browser. Figure 4 shows iSafe's
extension to the Yelp page of the venue "Top Value Trading
Inc." in Hialeah, FL (central left yellow rectangle containing
iSafe's safety recommendations).

Fig. 4. Snapshot of iSafe's plugin functionality for a Yelp venue. The orange
circle indicates the venue's safety level.

Fig. 5. Snapshots of iSafe on Android, panels (a) and (b).

We have also implemented an Android front-end for
iSafe's snapshot LCPs. We used the standard Java security
library to implement the cryptographic primitives employed
by PROFIL_R. For secret sharing, we used Shamir's scheme
and for digital signatures we used RSA. We also used the
kSOAP2 library to enable SOAP functionality on the Android
app. Figure 5 shows a snapshot of the iSafe Android app on a
Samsung Admire smartphone. We used the Google map API to
facilitate the location based service employed by our approach.

VI. Evaluation 

For testing purposes we have used Samsung Admire
smartphones running Android OS Gingerbread 2.3 with an 800MHz
CPU and a Dell laptop equipped with a 2.4GHz Intel Core
i5 processor and 4GB of RAM for the server. For local
connectivity the devices used their 802.11b/g Wi-Fi interfaces. All
reported values are averages taken over at least 10 independent
protocol runs.

Fig. 7. ZK-CTR Performance: (a) Dependence on the Benaloh modulus size,
(b) Dependence on the number of proof rounds.

iSafe: Figure 6(a) shows the overhead of the iSafe plugin when
collecting the reviews of a venue browsed by the user, as a
function of the number of reviews the venue has. It includes
the cost to request each review page, parse and process the
data for transfer. The experiments were performed on the Dell
laptop. The overhead exhibits a sub-linear dependence on the
number of reviews of the venue (under 1s for 10 reviews but under
30s for 4000 reviews), showing that Yelp's delay for successive
requests decreases. While even for 500 reviews the overhead
is less than 5s, we note that this cost is incurred only once per
venue. Subsequent accesses to the same venue, by any other
user, will no longer incur this overhead.

Spoter's wormhole defenses: Wormhole attacks are best
detected through timing analysis. We have tested Spoter using
a smartphone connected over ad hoc Wi-Fi to the laptop. The
round-trip Wi-Fi latency is under 3ms. On the Android device,
the time required to compute a (SHA-512) hash is 0.6ms. The
overhead imposed by Spoter on a wormhole attack is the
Wi-Fi round-trip latency, plus the hash time (0.003ms on the
laptop), plus the wired round-trip communication
latency. The one-way communication overhead between the
two attackers, if performed over the wired network, is at least
19ms (we tested with systems in Miami, San Francisco and
Chicago). In total, Spoter imposes an overhead on a wormhole
attack (43ms) that is almost 12 times the overhead imposed
on an honest user (3.6ms). Thus, wormhole attacks are easily
detectable in Spoter.
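
The gap between the two reported totals suggests a simple timing test. The sketch below is only an illustration: the two constants are the totals reported above, while the midpoint threshold and the function names are assumptions and not part of Spoter.

```python
HONEST_MS = 3.6      # reported overhead for an honest user
WORMHOLE_MS = 43.0   # reported overhead for a wormhole attack


def slowdown() -> float:
    """Ratio between the wormhole and the honest response times."""
    return WORMHOLE_MS / HONEST_MS


def looks_like_wormhole(measured_ms: float,
                        threshold_ms: float = (HONEST_MS + WORMHOLE_MS) / 2) -> bool:
    """Flag a challenge/response run whose latency exceeds the threshold.

    The default threshold (midpoint of the two reported totals) is a
    hypothetical choice made for this example.
    """
    return measured_ms > threshold_ms


if __name__ == "__main__":
    print(round(slowdown(), 2))
    print(looks_like_wormhole(4.0), looks_like_wormhole(43.0))
```

A deployment would calibrate the threshold against measured latency variance rather than use a fixed midpoint.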

A. PROFIL_R Evaluation

We have first measured the overhead of the Setup operation.
We set the number of ranges of the domain $D$ to be 5, Shamir's
TSS group size to 1024 bits and RSA's modulus size to 1024
bits. Figure 6(b) shows the Setup overhead on the smartphone
and laptop platforms, when the Benaloh modulus size ranges
from 64 to 2048 bits. Note that even a resource constrained
smartphone takes only 2.2s for 1024 bit sizes (0.9s on a
laptop). A marked increase can be noticed for the smartphone
when the Benaloh modulus is 2048 bits long: 13.5s. We note
however that this cost is amortized over multiple check-in runs.

We now focus on the most resource consuming component
of PROFIL_R: the ZK-CTR protocol. We measure the client
and venue (SPOTR_V) computation overhead as well as their
communication overhead. We set the number of sub-ranges
of domain $D$ to 5. We tested the client side running on
the smartphone and the venue component executing on the
laptop. Figure 7(a) shows the dependence of the three costs
for a single round of ZK-CTR on the Benaloh modulus size.
Given the more efficient venue component and the superior
computation capabilities of the laptop, the venue component
has a much smaller overhead. The communication overhead is
the smallest, exhibiting a linear increase with bit size. For a
Benaloh key size of 1024 bits, the average end-to-end overhead
of a single ZK-CTR round is 135ms. The venue component
is 29ms and the client component is 106ms. Furthermore,
Figure 7(b) shows the overheads of these components as a
function of the number of ZK-CTR rounds, when the Benaloh
key size is 1024 bits long. For 30 rounds, when a cheating
client's probability of success is $2^{-30}$, the total overhead is
3.6s.

We further examine the communication overhead in terms of
bits transferred during ZK-CTR between a client and a venue.
Let $N$ be the Benaloh modulus size and $B$ the sub-range count
of domain $D$. The communication overhead in a single
ZK-CTR round is $4BN + 3BN = 7BN$. The second component
of the sum is due to the average outcome of the challenge
bit. Figure 6(c) shows the dependency of the communication
overhead (in KB) on $B$, when $N = 1024$. Even when
$B = 20$, the communication overhead is around 17KB.
Figure 6(c) also shows the storage overhead (at a venue).
The storage overhead is only a fraction of the (single round)
communication overhead, $2BN$. For a single dimension, with
20 sub-ranges, the overhead is 5KB.
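
The two formulas above can be checked with a few lines of arithmetic. This is a minimal sketch with hypothetical function names; it assumes 1KB = 1024 bytes.

```python
def zkctr_round_bits(B: int, N: int) -> int:
    """Bits exchanged in one ZK-CTR round: 4BN + 3BN = 7BN."""
    return 4 * B * N + 3 * B * N


def storage_bits(B: int, N: int) -> int:
    """Bits stored at a venue for one dimension: 2BN."""
    return 2 * B * N


def bits_to_kb(bits: int) -> float:
    """Convert bits to kilobytes, assuming 1KB = 1024 bytes."""
    return bits / 8 / 1024


if __name__ == "__main__":
    B, N = 20, 1024
    print(bits_to_kb(zkctr_round_bits(B, N)))
    print(bits_to_kb(storage_bits(B, N)))
```

For $B = 20$ and $N = 1024$ this gives 17.5KB per round and 5KB of storage, consistent with the values reported above.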

VII. Related Work 

Golle et al. [25] proposed techniques allowing pollsters to
collect user data while ensuring the privacy of the users. The
privacy is proved at "runtime": if the pollster leaks private
data, it will be exposed probabilistically. Our work also allows
entities to collect private user data; however, the collectors are
never allowed direct access to private user data.

Toubiana et al. [26] proposed Adnostic, a privacy preserving
ad targeting architecture. Users have a profile that allows the
private matching of relevant ads. While PROFIL_R can be used
to privately provide location centric targeted ads, its main goal
is different: to compute location (venue) centric profiles that
preserve the privacy of contributing users.

Manweiler et al. [27] proposed SMILE, a privacy-preserving
"missed-connections" service similar to Craigslist, where the
service provider is untrusted and users do not have existing
relationships. The solution is distributed, allowing users to
anonymously prove to each other the existence of a past
encounter. While we have a similar setup, our work addresses a
different problem, of privately collecting location centric user
profile aggregates.

Location and temporal cloaking techniques, or introducing
errors in reported locations in order to provide 1-out-of-k
anonymity, were initially proposed in [28], followed by
a significant body of work [29], [30], [31], [32]. We note
that PROFIL_R provides an orthogonal notion of $k$-anonymity:
instead of reporting intervals containing $k$ other users, we
allow the construction of location centric profiles only when
$k$ users have reported their location. Computed LCPs hide the
profiles of the users: user profiles are anonymous, only aggregates
are available for inspection, and interactions with venues and
the provider are indistinguishable.

Fig. 6. (a) iSafe browser plugin overhead: Collecting reviews from venues, as a function of the number of reviews, (b) Setup dependence on Benaloh
modulus size, (c) Storage and communication overhead (in KB) as a function of range count.

Our work relies on the assumption that participants cannot
control a large number of fake, Sybil accounts. One way to
ensure this property is to use existing Sybil detection techniques.
Danezis and Mittal [33] proposed a centralized SybilInfer
solution based on Bayesian inference. Yu et al. proposed
distributed solutions, SybilGuard [34] and SybilLimit [35],
that use online social networks to protect peer-to-peer networks
against Sybil nodes. They rely on the fast mixing property of
social networks and the limited connectivity of Sybil nodes to
non-Sybil nodes.

Significant work has been done recently to preserve the
privacy of users from the online social network provider.
Cutillo et al. [36] proposed Safebook, a distributed online
social network where insiders are protected from external
observers through the inherent flow of information in the
system. Tootoonchian et al. [37] proposed Lockr, a system for
improving the privacy of social networks. It achieves this by
using the concept of a social attestation, which is a credential
proving a social relationship. Baden et al. [38] introduced
Persona, a distributed social network with distributed account
data storage. Sun et al. [39] proposed a similar solution,
extended with revocation capabilities through the use of
broadcast encryption. While we rely on distributed online social
networks, our goal is to protect the privacy of users while
also allowing venues to collect certain user statistics.

VIII. Conclusions 

We have proposed (i) novel mechanisms for building
aggregate location-centric profiles while maintaining the privacy
of participating users and ensuring their honesty during the
process and (ii) centralized and distributed, real-time variants
of the solution, along with applications that can benefit from
the construction of such profiles. We have shown that our
solutions are efficient, even when executed on resource constrained
mobile devices.